Part 1 - Background and methodology

The United Arab List (Ra'am; in Arabic: al-Qa'ima al-Arabiya al-Muwahhada) is an Arab-Islamic party that was established ahead of the elections to the 14th Knesset (May 1996). Since then, the list has drawn its members from the southern faction of the Islamic Movement, the Mada party and others. Ra'am is one of the three prominent Arab parties in Israeli politics.

The party's Knesset list is chosen by the Shura Council of the southern faction of the Islamic Movement. The party enjoys broad support among the Bedouin public, and is considered the most pragmatic of the Arab parties when it comes to cooperating with Jewish Zionist and even nationalist parties. Accordingly, the party sat in the previous coalition, first under Naftali Bennett of Yamina and then under Yair Lapid of Yesh Atid.

The data files we used for the report:

Pre-processing we performed on the data:
  • Removal of external envelopes - for most of the work (Part 3 onwards) the data was processed without the external envelopes.
  • Creating a dictionary of party names - to convert the ballot-slip letters into party names, for a clear and informative presentation of the data.
  • Invalid votes - these votes are not counted among the valid votes in the elections.
  • HCA index identifiers - for the locality ("Settlement symbol") and for the ballot box ("Settlement symbol", "Ballot number", "Settlement name").
  • Socio-economic ranking - we compared the localities in the socio-economic cluster file with the localities in the 2022 election files and kept only their intersection, so a locality that did not appear in both files was removed from the data.
  • We used the Google API to obtain decimal-degree (DD) coordinates for localities in Israel, since these were not available online in a usable format.
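
The pre-processing steps listed above can be sketched roughly as follows. This is a minimal illustration and not the notebook's actual code: the toy tables, the "invalid" and "cluster" column names, the settlement symbol used for the external-envelope row and the two-entry letter dictionary are all assumptions made for the example.

```python
import pandas as pd

# Hypothetical ballot-level table; column names and values are assumptions.
raw = pd.DataFrame({
    "Settlement symbol": [8500, 8500, 1161, 9999],
    "Settlement name": ["רמלה", "רמלה", "רהט", "מעטפות חיצוניות"],
    "Ballot number": [1.0, 2.0, 1.0, 1.0],
    "invalid": [3, 1, 2, 40],
    "עם": [10, 300, 420, 900],
    "מחל": [250, 20, 5, 7000],
})

# Hypothetical socio-economic table keyed by the same settlement symbol.
clusters = pd.DataFrame({"Settlement symbol": [8500, 1161], "cluster": [4, 1]})

# Illustrative dictionary from ballot-slip letters to party names.
LETTERS_TO_PARTY = {"עם": "רעם", "מחל": "ליכוד"}


def preprocess(raw, clusters, drop_external=True):
    """Drop external envelopes and invalid votes, rename parties, keep shared localities."""
    df = raw.copy()
    if drop_external:
        df = df[df["Settlement name"] != "מעטפות חיצוניות"]
    df = df.drop(columns=["invalid"]).rename(columns=LETTERS_TO_PARTY)
    # An inner join keeps only localities that appear in both files.
    return df.merge(clusters, on="Settlement symbol", how="inner")


clean = preprocess(raw, clusters)
```

Passing `drop_external=False` would correspond to the parts of the analysis that keep the external envelopes, although in this toy example the inner join still drops that row because it has no socio-economic cluster.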

Part 2 - Changes from last elections


As background to the last three election rounds, we think it is vital to note that Mansour Abbas became chairman of the party in April 2019, when it ran under the name Ra'am - Balad.

In the 2020 elections (the 23rd Knesset), Ra'am ran as part of the Joint List and won 4 of the list's 15 mandates.

However, during the election campaign for the 24th Knesset (March 2021), chairman Mansour Abbas decided to take the party out of the Joint List and run independently.

We open the analysis of Ra'am with a heat map of the number of votes for the party. We obtained the longitude and latitude of each locality from the Google API, using a function we built.

We then cross-referenced the locality coordinates with the party's votes in each locality and created the heat map.
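
The cross-referencing step can be sketched as a join between a coordinates table and a votes table. The coordinates and vote counts below are made-up illustration values, and the output format (rows of `[lat, lon, weight]`) is an assumption: it is the shape that heat-map layers such as folium's `HeatMap` plugin accept, but the report does not state which plotting library was used.

```python
import pandas as pd

# Hypothetical inputs: approximate coordinates and invented vote counts.
coords = pd.DataFrame({
    "Settlement name": ["רהט", "סחנין", "רמלה"],
    "lat": [31.39, 32.86, 31.93],
    "lon": [34.75, 35.30, 34.87],
})
votes = pd.DataFrame({
    "Settlement name": ["רהט", "סחנין", "תל שבע"],
    "רעם": [12000, 4000, 5000],
})


def heatmap_points(coords, votes, party="רעם"):
    """Join coordinates with votes and return [lat, lon, weight] rows."""
    merged = coords.merge(votes, on="Settlement name", how="inner")
    # Scale weights to [0, 1] so the strongest locality has weight 1.
    weights = merged[party] / merged[party].max()
    return [[lat, lon, w] for lat, lon, w in zip(merged["lat"], merged["lon"], weights)]


points = heatmap_points(coords, votes)
```

Localities that are missing from either table (here רמלה and תל שבע) are dropped by the inner join, so they do not appear on the map.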

From the analysis of the heat map (interactive map) it is evident that most Ra'am voters come from the Israeli periphery, from Arab and Bedouin communities.

The analysis of the graphs shows that Ra'am ran a more successful election campaign in 2022 than in 2021, both in the share of voters for the party (out of all voters) and in the absolute number of votes.

This increase is also reflected in the number of mandates the party received in the last elections (5) compared to the previous elections (4).

However, it is important to note that the failure of Balad and Meretz to pass the electoral threshold in the 2022 elections, in contrast to 2021, had a significant effect on Ra'am's "growth" in the number of mandates, as we will see in the sections of the report that deal with vote transfers between parties.

After all, these three parties belong to the left bloc, and Balad and Ra'am are closely tied to each other: they are two of the three parties that almost exclusively represent Arab-Israeli voters.

Part 3 - Voting frequency changes

Despite some hesitation about the double envelopes, we decided to keep the double-envelope votes in the analysis in this section. The reason is that the analysis here concerns general voting patterns, for the party or across Israeli society, and does not require an exact distinction between polling stations or localities.
[Figure: "Votes percent רעם party 2022"; x-axis: רעם, y-axis: Votes percent]
[Figure: "Votes percent רעם party 2021"; x-axis: רעם, y-axis: Votes percent]
Comparing the actual voting percentages with the potential voting percentages (after correction), we see that in both the 2021 and the 2022 elections the actual share of votes for Ra'am is lower than the party's potential support.

Therefore, the party can and should concentrate its efforts on "biting" into this gap.

Next, we compare the party's actual voting rate out of its potential voting rate (the blue bar) with the general voter turnout in the State of Israel in that election year.
It is easy to see that although the voting percentages for the party are higher than the general voting percentages, an increase in general turnout in the country may benefit the party.

This is because the party still has much room to grow towards its potential (more than 20%).
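
The report does not spell out the correction used to obtain the potential voting percentages. One common choice, assumed here purely for illustration, is to divide each ballot's votes by that ballot's turnout, which estimates how the party would have done had every eligible voter in the ballot voted like those who did. The function name and the toy numbers are hypothetical.

```python
import numpy as np


def actual_and_potential_share(party_votes, valid_votes, eligible):
    """Return (actual share, turnout-corrected share) for one party.

    Assumes every ballot has at least one valid vote.
    """
    party_votes = np.asarray(party_votes, dtype=float)
    valid_votes = np.asarray(valid_votes, dtype=float)
    eligible = np.asarray(eligible, dtype=float)

    actual = party_votes.sum() / valid_votes.sum()
    turnout = valid_votes / eligible          # per-ballot turnout
    potential_votes = party_votes / turnout   # votes under full turnout
    # Under full turnout the total number of votes equals the eligible voters.
    potential = potential_votes.sum() / eligible.sum()
    return actual, potential


# Toy example: the party is strong in a low-turnout ballot.
actual, potential = actual_and_potential_share(
    party_votes=[300, 0], valid_votes=[400, 800], eligible=[1000, 1000]
)
```

In this toy example the corrected share is higher than the actual share, which is the pattern described above for a party whose support is concentrated in low-turnout ballots.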

Part 4 - Economic & geographic analysis

In this section we present how the frequency of voting and the absolute number of votes for Ra'am (רעם) vary between localities and polling stations.

In addition, we compare Ra'am with the other Arab parties in Israeli politics, including through the support each receives in cities characterized as Jewish.

Finally, we turn to the socio-economic analysis: we segment the frequency of voting for Ra'am by socio-economic cluster, and we look for a relationship between the share of Ra'am voters in a locality and its Gini index of inequality.

Here we present a general picture of the absolute number of votes for Ra'am across the dozen localities with the highest number.

This is admittedly a rather simplistic analysis, but we thought it important to start with it, in order to show in which localities the party has the highest absolute support, that is, its centers of power in the absolute, quantitative sense.

In the last two charts, we compared the number of votes for Ra'am with the number of votes for the other Arab parties in Israeli politics, Balad (בלד) and Hadash-Ta'al (חדש תעל). The first bar plot shows the number of votes for each party in cities identified with Ra'am, while the second shows the numbers in cities identified with Balad.

The plots clearly show how much stronger Ra'am's support is in smaller cities that are homogeneous in religion and affiliation. This is in contrast to larger, mixed and more "developed" cities, where the other parties receive significantly wider support.

Our message for Ra'am in this case is: while maintaining your strength in the small, homogeneous localities, invest in advocacy and communication in the mixed and larger cities, in order to expand your support base and "bite" into the Arab-sector voters who currently vote for other parties.

Looking at the dozen localities with the highest relative frequency of voting for Ra'am, it is evident that most of them are very small localities that belong to the Bedouin population. Therefore, we analyze the frequency of voting for Ra'am only among cities with over 20,000 inhabitants.

Analyzing the frequency of voting for Ra'am only among cities with over 20,000 inhabitants, it is evident that all of the dozen cities with the highest frequency are Arab or mixed cities.
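
The restriction to larger cities can be sketched as a filter followed by a ranking. The column names "population" and "valid_votes" and the three toy localities are assumptions made for the example; only the 20,000 threshold and the top-dozen cut come from the text.

```python
import pandas as pd


def top_localities(df, party="רעם", min_population=20_000, n=12):
    """Rank localities above a population threshold by the party's vote share."""
    big = df[df["population"] > min_population].copy()
    big["share"] = big[party] / big["valid_votes"]
    return big.nlargest(n, "share")[["Settlement name", "share"]]


# Toy data: locality A has the highest share but is too small to be included.
localities = pd.DataFrame({
    "Settlement name": ["A", "B", "C"],
    "population": [5_000, 30_000, 80_000],
    "valid_votes": [1_000, 10_000, 40_000],
    "רעם": [900, 4_000, 2_000],
})
top = top_localities(localities, n=2)
```

Dropping the population filter (`min_population=0`) reproduces the first ranking, which is dominated by very small localities.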

Although in some of them the party's share is not low, the party clearly has considerable room to grow in these localities, especially in light of the earlier comparisons with the other Arab parties.

It is not difficult to notice that Ra'am did not gain wide support in Jewish cities. The reason for this, of course, lies in the party's Islamic ideology.

However, we must not ignore the fact that the party does have some support base there, which needs to be developed and expanded with different, adapted tools.

Comparing the frequency of voting between the Arab parties reveals Ra'am's weakness, relative to the other Arab parties, in attracting voters from Jewish communities. Alongside this negative insight lies an opportunity: there is a potential electorate in the Jewish communities that may switch to Ra'am.

Also, while the percentage of support for Ra'am is relatively high in the Jewish peripheral cities (north and south), in the larger central cities such as Ramat Gan and Givatayim Ra'am is pushed to the margins.

Similarly to the insights from the previous analysis of voting for Ra'am in Jewish cities, the analysis by socio-economic cluster shows that support for Ra'am is concentrated among voters of low to medium socio-economic status.

Most Ra'am voters come from cities with a medium-low Gini index, that is, from localities characterized by relatively low inequality. This is not a real surprise, and it agrees with the insight from the previous graph: most Ra'am voters are of low socio-economic status, from Israel's socio-economic periphery, so their localities have a relatively egalitarian distribution of income, since income is relatively low for everyone.
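
To make the last point concrete, here is a small sketch of the Gini coefficient itself. In the report the Gini index is taken from an external data file, so this function is only an illustration of why a locality where everyone earns a similarly low income gets a low value; the income lists are invented.

```python
def gini(incomes):
    """Gini coefficient of a list of non-negative incomes (0 = full equality)."""
    xs = sorted(incomes)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0
    # Standard formula for sorted data: G = sum_i (2i - n - 1) * x_i / (n * sum(x)).
    weighted = sum((2 * i - n - 1) * x for i, x in enumerate(xs, start=1))
    return weighted / (n * total)


equal_low = gini([5, 5, 5, 5])       # everyone earns the same low income
concentrated = gini([0, 0, 0, 100])  # one person holds all the income
```

The coefficient depends only on how income is distributed, not on its level, so a uniformly poor locality and a uniformly rich one both score 0.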

Part 5 - PCA analysis

In this part, we present the voting data from the 2022 elections through the first two principal components of a PCA. PCA lets us show the differences between ballots in a simple two-dimensional plot while preserving as much of the variance between ballots as possible, so that the farther apart two points are on the graph, the more the voters in the two ballots differ.
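
As a rough illustration of the projection, the sketch below computes two principal components with a plain SVD. It assumes that PCA is applied to per-ballot vote shares; the report does not say whether shares or raw counts were used, nor which library was used, and the 4x3 vote matrix is invented.

```python
import numpy as np


def pca_2d(votes):
    """Project ballots (rows) onto the first two principal components."""
    votes = np.asarray(votes, dtype=float)
    shares = votes / votes.sum(axis=1, keepdims=True)  # per-ballot vote shares
    centered = shares - shares.mean(axis=0)            # center each party column
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    scores = centered @ vt[:2].T                       # coordinates on PC1, PC2
    explained = s**2 / np.sum(s**2)                    # variance ratio per component
    return scores, explained[:2]


# Toy data: 4 ballots x 3 parties (made-up counts).
votes = np.array([
    [300, 20, 10],
    [250, 40, 30],
    [10, 200, 150],
    [5, 180, 220],
])
scores, explained = pca_2d(votes)
```

Each row of `scores` is one ballot's position in the two-dimensional plot, and `explained` shows how much of the total variance each of the two axes retains.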

We can interpret the principal components in the following manner:

  • PCA1: High values (on the right side) indicate a more conservative ideology on the political map, and vice versa.
  • PCA2: High values (upwards) indicate a Jewish nationality and low values indicate an Arab nationality.

It can be inferred that most of the party's votes come from Arab villages that are more conservative in their stance and from large mixed localities, while few votes come from areas whose population is more Jewish and conservative.

Next we present the ballots in a similar way, but this time color marks the percentage of votes that the party lost or gained relative to the 2021 elections.

In the graph, it is possible to see the percentage of votes that the party lost or gained in each ballot. The points marked in red are the ballots with the highest percentage of vote loss, and the points marked in pale green are the ballots in which the number of votes increased by a high percentage.

It can be inferred that, in the lower-right area of the graph, the party gained relatively more votes in areas considered more conservative ideologically and lost votes in more moderate areas. Additionally, in each of the following localities at least 25% of the votes were lost in at least one ballot compared to the previous election, so we recommend focusing on them in the next campaign: 'דייר חנא', 'לוד', 'מגאר', 'משהד', 'סחנין', 'עטאוונה שבט', 'רמלה'.
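
The list of localities above can be produced by a filter of the following kind. This is a sketch under assumptions: the column names "votes_2021" and "votes_2022" are hypothetical, and the 25% is interpreted here as a relative drop in the party's vote count in a ballot, which the report does not state explicitly.

```python
import pandas as pd


def localities_with_big_loss(df, threshold=0.25):
    """Localities where at least one ballot lost `threshold` of the party's votes."""
    change = (df["votes_2022"] - df["votes_2021"]) / df["votes_2021"]
    lost = df[change <= -threshold]
    return sorted(lost["Settlement name"].unique())


# Toy data: only the first ballot dropped by at least 25%.
ballots = pd.DataFrame({
    "Settlement name": ["לוד", "לוד", "רהט"],
    "votes_2021": [100, 200, 400],
    "votes_2022": [70, 210, 390],
})
flagged = localities_with_big_loss(ballots)
```

A single ballot is enough to flag a locality, so large cities with many ballots are more likely to appear on the list than small ones.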

Principal Component Analysis (PCA) is a method used to simplify and visualize high-dimensional data. In this case, it was used to analyze voting data from the 2022 elections and present the results in a two-dimensional graph.

The advantage of PCA is its ability to reduce the dimensionality of complex data effectively while preserving its most important features, which makes the data easy to visualize and understand. One disadvantage is that it is sensitive to outliers and captures only linear structure, so it may not always produce the most accurate picture. Alternative methods for data visualization and analysis include t-SNE, MDS, and LLE. The results obtained from PCA are considered credible when the data is preprocessed and cleaned properly and the assumptions of PCA are met.

Parts 6,7 - Votes transfers

The plot gives a good estimate of how voters shifted between parties between the 2021 election and the 2022 election. The estimation method is explained further down.

Each "flow" indicates a voting pattern between the two elections. Hovering over a flow pops up a text describing the specific change: what share of the voters who voted for party j in 2021 voted for a certain party in the 2022 election.

So the numbers are the share of a party's 2021 voters who follow the same voting pattern: voting for party X in 2021 and for party Y in 2022. If a party kept all of its 2021 voters in 2022, the result is a flow equal to 100% (1 in the plot).

We omitted every voting trend (a flow in the plot linking a party in 2021 to a party in 2022) smaller than 0.5%. This seemed a good threshold, since 0.5% of the total number of 'רעם' voters amounts to fewer than 1,000 people (out of ~167,000-194,000) in both elections, and we want to focus on the main, important trends.
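
The thresholding and the arithmetic behind it can be sketched as follows. The non-zero shares are taken from the '21_רעם' row reported in this part; the function name is hypothetical, and the voter totals are the rounded figures quoted above.

```python
def main_flows(row, threshold=0.005):
    """Keep only the flows whose estimated share is at least the threshold."""
    return {dest: share for dest, share in row.items() if share >= threshold}


# Part of the estimated 2021 -> 2022 row for 'רעם' (zero entries abbreviated).
row_21_raam = {
    "22_בלד": 0.080528,
    "22_חדש תעל": 0.031526,
    "22_רעם": 0.887946,
    "22_ליכוד": 0.0,
}
kept = main_flows(row_21_raam)

# Size of a 0.5% flow in people, for the approximate voter totals quoted above.
cutoff_low = 0.005 * 167_000   # about 835 people
cutoff_high = 0.005 * 194_000  # about 970 people
```

Both cutoffs fall below 1,000 people, which is the justification given for treating smaller flows as noise.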

From the bigger plot, depicting all changes in voting patterns for all parties between the two elections (the first graph), we filtered out the other parties and focused on 'רעם', in two different plots.

The first plot describes how the 'רעם' voters of 2021 voted in 2022, in other words, the distribution of the 2021 'רעם' voters. The plot indicates that 89% of the 'רעם' voters of 2021 remained 'רעם' voters in 2022. 8.1% of the 2021 voters decided to vote for 'בלד' in 2022, and 3.2% voted for 'חדש תעל'. Together these two flows amount to around 18,900 votes. Besides that, the party did not lose votes to any other party in a flow larger than half a percent of its voters.

Estimated share of '21_רעם' voters by 2022 destination:

2022 destination | share of 21_רעם voters
22_עבודה | 0.0
22_הבית היהודי | 0.0
22_יהדות התורה | 0.0
22_בלד | 0.080528
22_חדש תעל | 0.031526
22_ציונות דתית | 0.0
22_המחנה הממלכתי | 0.0
22_ישראל ביתנו | 0.0
22_ליכוד | 0.0
22_מרצ | 0.0
22_רעם | 0.887946
22_יש עתיד | 0.0
22_שס | 0.0
22_לא הצביעו | 0.0

The second plot describes where the 'רעם' voters of 2022 came from (how the 'רעם' voters of 2022 had voted in 2021). Besides the unsurprising fact that around 89% of the 'רעם' voters of 2022 had also voted for 'רעם' in 2021, 'רעם' also gained votes from former 'מרצ' voters (~7.6%) and from 'תקוה חדשה' (4%), which merged with 'כחול לבן' but lost many votes in the process to other parties, including 'רעם'. Another significant trend is that 3.7% of the 'רעם' voters of 2022 did not vote at all in 2021. The voting rate went up in these elections from 67.44% to 70.63%, and it seems that 'רעם' benefited from this. It is important to note that the dissolution of 'הרשימה המשותפת' did not contribute a significant number of new votes to 'רעם'.

The results these plots rely on are shown in this heat map, which describes the estimated vote transfers from parties in the 2021 election (rows) to parties in the 2022 election (columns).
The numbers are the solution of a linear regression problem that estimates the vote transfers from the ballot-level data (excluding 'מעטפות חיצוניות'), based on the differences in each ballot's votes between the two elections.

The estimates are obtained by the following formula, which solves the least-squares regression problem, as we saw in class: $\widehat{M} = \left[ {N^{(a)}}^{T} N^{(a)} \right]^{-1} {N^{(a)}}^{T} N^{(b)}$
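
A minimal numerical sketch of this estimator is given below. It assumes that $N^{(a)}$ is the ballots-by-parties vote matrix of 2021 and $N^{(b)}$ the corresponding matrix of 2022, which is our reading of the notation. The function name and the toy data are hypothetical, and `numpy.linalg.lstsq` is used instead of an explicit matrix inverse because it is numerically more stable while giving the same solution when $N^{(a)}$ has full column rank.

```python
import numpy as np


def estimate_transfer_matrix(n_a, n_b):
    """Least-squares estimate of M in n_b ~ n_a @ M (rows: 2021, columns: 2022)."""
    n_a = np.asarray(n_a, dtype=float)
    n_b = np.asarray(n_b, dtype=float)
    m_hat, *_ = np.linalg.lstsq(n_a, n_b, rcond=None)
    return m_hat


# Toy check: generate noise-free 2022 votes from a known transfer matrix.
rng = np.random.default_rng(0)
true_m = np.array([[0.9, 0.1], [0.2, 0.8]])
n_a = rng.integers(50, 500, size=(30, 2)).astype(float)
n_b = n_a @ true_m
m_hat = estimate_transfer_matrix(n_a, n_b)

# The closed-form expression from the text gives the same result.
closed_form = np.linalg.inv(n_a.T @ n_a) @ n_a.T @ n_b
```

Note that plain least squares does not force the entries of the estimate to lie between 0 and 1 or the rows to sum to 1, so small negative values can appear on real data.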

Part 8 - Significant votes transfers

Since different parties and ballots may have errors with different variances, the estimate of each entry of the vote-transfer matrix may be noisy, and we need to assess its uncertainty in order to draw conclusions about the true (unknown) parameters. We therefore use bootstrap to better estimate the variance of each entry, which makes the hypothesis tests more reliable, since bootstrap is a more general and robust estimation method.

The yellow squares indicate vote transfers that are significant at $ \alpha = 0.001 $. For votes leaving 'רעם', the significant transfers are from 'רעם' to 'בלד', 'רעם', and 'חדש תעל'. In the other direction, meaning where the 'רעם' voters of 2022 came from, a significant transfer exists from 'רעם', 'מרצ' and 'הליכוד', so we do see a small change compared to the former results.
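
The bootstrap step can be sketched as resampling ballots with replacement and re-estimating the transfer matrix each time. The exact test used in the report is not stated; the z-score rule at the end (comparing |estimate / standard error| with 3.29, the two-sided normal critical value for $\alpha = 0.001$) is an assumption for illustration, as are the function name and the toy data.

```python
import numpy as np


def bootstrap_transfer(n_a, n_b, n_boot=200, seed=0):
    """Bootstrap mean and standard error of each transfer-matrix entry."""
    rng = np.random.default_rng(seed)
    n_a = np.asarray(n_a, dtype=float)
    n_b = np.asarray(n_b, dtype=float)
    n_ballots = n_a.shape[0]
    estimates = np.empty((n_boot, n_a.shape[1], n_b.shape[1]))
    for b in range(n_boot):
        # Resample whole ballots with replacement, then refit the regression.
        idx = rng.integers(0, n_ballots, size=n_ballots)
        estimates[b] = np.linalg.lstsq(n_a[idx], n_b[idx], rcond=None)[0]
    return estimates.mean(axis=0), estimates.std(axis=0, ddof=1)


# Toy data with a known transfer matrix and some noise.
rng = np.random.default_rng(1)
true_m = np.array([[0.9, 0.1], [0.0, 1.0]])
n_a = rng.integers(50, 500, size=(200, 2)).astype(float)
n_b = n_a @ true_m + rng.normal(0.0, 5.0, size=(200, 2))

m_mean, m_se = bootstrap_transfer(n_a, n_b)
significant = np.abs(m_mean / m_se) > 3.29  # assumed z-test at alpha = 0.001
```

Resampling ballots rather than residuals keeps each ballot's own error variance attached to it, which fits the concern about unequal variances raised above.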

Now we want to check how good our vote-transfer model is for 'רעם' compared to other parties. We use bootstrap again, for the same reasons. We split the data into 80% train and 20% test, and on the test data we measure how good the model's predictions are, using MSE (the mean squared distance between the real observations and the predicted values).
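
The evaluation can be sketched as follows: fit the transfer matrix on a random 80% of the ballots and compute the MSE of the predicted 2022 votes on the remaining 20%, separately for each party. The function name and the toy data are hypothetical, and the sketch shows a single split; repeating it over several seeds or bootstrap samples gives a distribution of MSE values.

```python
import numpy as np


def holdout_mse_per_party(n_a, n_b, test_fraction=0.2, seed=0):
    """Fit on the train ballots and return the test MSE for each 2022 party."""
    rng = np.random.default_rng(seed)
    n_a = np.asarray(n_a, dtype=float)
    n_b = np.asarray(n_b, dtype=float)
    perm = rng.permutation(n_a.shape[0])
    n_test = int(round(test_fraction * len(perm)))
    test_idx, train_idx = perm[:n_test], perm[n_test:]

    m_hat = np.linalg.lstsq(n_a[train_idx], n_b[train_idx], rcond=None)[0]
    residuals = n_b[test_idx] - n_a[test_idx] @ m_hat
    return (residuals**2).mean(axis=0)  # one value per column of n_b


# Toy check: with noise-free data the held-out error is essentially zero.
rng = np.random.default_rng(2)
true_m = np.array([[0.9, 0.1], [0.2, 0.8]])
n_a = rng.integers(50, 500, size=(100, 2)).astype(float)
n_b = n_a @ true_m
mse = holdout_mse_per_party(n_a, n_b)
```

Because the MSE is computed on raw vote counts, its scale depends on the size of the party, which is worth keeping in mind when comparing parties.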

Calculated both with and without the non-voter data (labeled "+NV" in the bar plot), the MSE for 'רעם' is relatively high compared to other parties. Since we want the MSE to be as small as possible, this means the vote-transfer predictions for 'רעם' are not as good as for other parties. We should therefore be cautious in adopting the findings of this chapter, which is why we use rounded values here and stick to the main, prominent trends shown in the results.

Part 9 - Suspicious ballots

To check for suspicious ballots, we first verified that no ballot has more votes than people with voting rights. We then checked the distance between the voting frequency for Ra'am in a specific ballot and the average voting frequency in the same city in 2022. The table below shows the ten ballots with the largest distance from their city's average. For example, ballot number 951 in the city of Ramla is more than 30% away from the average voting rate for Ra'am among all polling stations in Ramla; but since Ramla is a large and diverse city, such a situation is possible (for example, in a polling station located in a neighborhood whose population differs from most of the city).

city_name | city | ballot | רעם | Difference from avg | Difference from prev year
רמלה | 8500 | 951.0 | 0.699605 | 0.640822 | 0.066474
כאבול | 504 | 10.4 | 0.825776 | 0.517081 | 0.000000
אעבלין | 529 | 13.0 | 0.694444 | 0.453260 | 0.211518
חורה | 1303 | 14.0 | 0.927536 | 0.401710 | 0.000000
סחנין | 7500 | 33.4 | 0.664179 | 0.373586 | 0.000000
רהט | 1161 | 60.0 | 0.852368 | 0.371321 | 0.241678
עכו | 7600 | 80.0 | 0.437736 | 0.357219 | 0.166363
זרזיר | 975 | 9.0 | 0.909420 | 0.352246 | 0.452328
שפרעם | 8800 | 43.0 | 0.549020 | 0.347466 | 0.064269
תל שבע | 1054 | 16.3 | 0.943548 | 0.330056 | 0.058615

An additional check that we performed is a comparison with the 2021 election. To find suspicious ballots more clearly, we created a graph that shows, for each ballot, the difference from the average vote for the party in its city and the deviation from the same ballot's result in the 2021 elections.

The size of a point represents the deviation from the average vote in the city, and the color of the point represents the deviation from the percentage of the vote in 2021. In addition, the x-axis represents the number of ballots in the city: the more ballots a city has, the larger it is likely to be and the more diverse its population, so we can suspect such ballots less.

As a result, we want to look for large yellow points, such as the point of Zarzir (זרזיר), where the deviation from the average is 35% and the deviation from the previous year is more than 45%, which could indicate a suspicious ballot.

The method used for identifying suspicious ballots is based on comparing the voting frequency in a specific ballot with the average in the city, as well as with the voting frequency in the same ballot in the previous election.

This approach allows for the identification of outliers, which may indicate potential fraud or manipulation. The advantages of this method include the simplicity of the calculation and the ability to identify specific ballots that deviate significantly from the norm. However, this method has some disadvantages as well. It assumes that the previous election's voting patterns are representative of the current one, which may not always be the case.

It also assumes that the average voting pattern in a city is representative of each of its ballots, which may not be true for smaller cities or for cities with diverse populations. Alternative methods for identifying suspicious ballots include statistical tests such as chi-squared tests. The results obtained from this method are credible, but a thorough investigation including other methods and evidence should be conducted to confirm or reject potential fraud or manipulation. Another method that comes to mind is calculating the MSE of each city, but since fraud would occur in a specific ballot rather than across a whole city, a city-level MSE does not make sense here.
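
The two checks described in this part can be sketched as follows. All column names ("valid_2022", "eligible_2022" and so on) and the toy ballots are assumptions for the example, and the city average is taken here as the simple mean of the ballot-level shares, which is one possible reading of "the average in the city".

```python
import pandas as pd


def over_capacity(df):
    """Ballots with more valid votes than eligible voters (should be empty)."""
    return df[df["valid_2022"] > df["eligible_2022"]]


def ballot_deviations(df, party="רעם"):
    """Deviation of each ballot from its city average and from its 2021 share."""
    out = df.copy()
    out["share_2022"] = out[party + "_2022"] / out["valid_2022"]
    out["share_2021"] = out[party + "_2021"] / out["valid_2021"]
    city_avg = out.groupby("city_name")["share_2022"].transform("mean")
    out["diff_from_avg"] = (out["share_2022"] - city_avg).abs()
    out["diff_from_prev_year"] = (out["share_2022"] - out["share_2021"]).abs()
    return out.sort_values("diff_from_avg", ascending=False)


# Toy city with three ballots; the third one stands out.
ballots = pd.DataFrame({
    "city_name": ["X", "X", "X"],
    "ballot": [1, 2, 3],
    "eligible_2022": [150, 150, 150],
    "valid_2022": [100, 100, 100],
    "valid_2021": [100, 100, 100],
    "רעם_2022": [10, 10, 70],
    "רעם_2021": [10, 20, 65],
})
ranked = ballot_deviations(ballots)
```

Taking the first ten rows of `ranked` over all cities would give a table of the same form as the one shown in this part.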
